Speech Recognition with Hierarchical Codebook Search
Abstract
1 Specifications
  1.1 Problem Description
  1.2 The Decimation Algorithm
  1.3 Goal
2 Preconditions
  2.1 Distortion and Distance Measurement
  2.2 Random Generated Data
  2.3 Number of Initial Codebook Vectors
3 Solution Process
4 Program Description
  4.1 Training, Test and Codebook Vector Generation
  4.2 Hierarchical Codebook Generation with Decimation
  4.3 Hierarchical Codebook Generation with LBG
  4.4 Test and Quantization Programs
5 Evaluation
  5.1 General Results
    5.1.1 Equality of a Vector Quantization with a Full and a Hierarchical Codebook
    5.1.2 Degenerated Trees
  5.2 Evaluation for Different Values of Parameter ft
  5.3 Evaluation for Different Values of Parameter B
  5.4 Application to a Speech Recognition Task
Similar Resources
A fast search method of speaker identification for large population using pre-selection and hierarchical matching
This paper investigates search performance during the matching phase of a speaker identification system based on vector quantization (VQ). The voice of each person is recorded in an office room with personal computers. The LPC-cepstrum is selected as the feature vector. To achieve a higher identification success rate, it is necessary to use a larger codebook for each person. Consequen...
Learning of Invariant Object Recognition in a Hierarchical Network
In this paper we propose an object recognition system implementing three basic principles: forming temporal groups of features, learning in a hierarchical structure, and using feedback to predict future input. It gives very good results on publicly available datasets. A precondition for successful learning is that training images are presented to the system in an appropriate order such that i...
A study on LVCSR and keyword search for Tagalog
We describe a state-of-the-art large vocabulary continuous speech recognition (LVCSR) and keyword search (KWS) system trained on roughly 70 hours of conversational telephone speech. Using the Kaldi speech recognition toolkit, we investigate several aspects: for the acoustic front-end, we analyze the use of mel-frequency cepstral coefficients (MFCC), pitch and probability-of-voicing (PoV), and d...
Robust Speech Recognition by DHMM with A Codebook Trained by Genetic Algorithm
This paper uses genetic algorithms (GA) to train a codebook for a Discrete Hidden Markov Model (DHMM) applied to speech recognition. The GA-trained DHMM is then used to increase the recognition rate for Mandarin speech. Vector quantization based on a codebook is a fundamental step in recognizing the speech signal with a DHMM. A codebook is first trained by genetic algorithms throug...
Fuzzy Vector Quantization on the Modeling of Discrete Hidden Markov Model for Speech Recognition
This paper applies fuzzy vector quantization (FVQ) to the modeling of a Discrete Hidden Markov Model (DHMM) in order to improve the recognition rate for Mandarin speech. Vector quantization based on a codebook is a fundamental step in recognizing the speech signal with a DHMM. A codebook is first trained by the K-means algorithm using Mandarin training speech. Then, based on the trained...
Publication date: 2014